Dependency-Based Bracketing Transduction Grammar for Statistical Machine Translation
Abstract
In this paper, we propose a novel dependency-based bracketing transduction grammar for statistical machine translation, which converts a source sentence into a target dependency tree. Unlike conventional bracketing transduction grammar models, we encode target dependency information directly into the lexical rules, and we employ two maximum entropy models to determine the reordering and the combination of partial dependency structures when two neighboring blocks are merged. With a dependency language model further incorporated, large-scale experiments on a Chinese-English task show that our system achieves significant improvements over the baseline system on various test sets, even with fewer phrases.
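The block-merging step mentioned in the abstract can be illustrated with a minimal sketch of the two standard bracketing transduction grammar merging rules, straight and inverted. This is not the authors' system: the `Block` class, the `merge` function, and the toy example are hypothetical names introduced here for illustration only. The sketch does not build dependency structures, and the merge order is passed in by hand where a trained maximum entropy model would make the decision.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Block:
    """Hypothetical container: a source span [start, end) and its target words."""
    start: int
    end: int
    target: Tuple[str, ...]


def merge(left: Block, right: Block, inverted: bool) -> Block:
    """Merge two neighboring blocks with a straight or an inverted rule."""
    if left.end != right.start:
        raise ValueError("blocks must be adjacent on the source side")
    if inverted:
        # Inverted rule: the target order is the reverse of the source order.
        target = right.target + left.target
    else:
        # Straight rule: the target order follows the source order.
        target = left.target + right.target
    return Block(left.start, right.end, target)


# Toy example (assumed data): two adjacent source spans already translated.
with_sharon = Block(0, 2, ("with", "Sharon"))
held_talks = Block(2, 5, ("held", "talks"))

straight = merge(with_sharon, held_talks, inverted=False)
swapped = merge(with_sharon, held_talks, inverted=True)
print(" ".join(straight.target))
print(" ".join(swapped.target))
```

In a full system, the `inverted` flag would presumably come from a learned reordering model scored over features of the two blocks, and a second model would decide how the partial dependency structures attached to the blocks are combined; both are omitted here.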
Similar resources
Linear Transduction Grammar Alignments as a Second Translation Path
Stochastic Bracketing Linear Transduction Grammars are used to create an additional phrase table, which is used as a second translation path through a Phrase-based Statistical Machine Translation System. The rationale for transduction grammars, the details of the system and some results are presented.
A Polynomial-Time Algorithm for Statistical Machine Translation
We introduce a polynomial-time algorithm for statistical machine translation. This algorithm can be used in place of the expensive, slow best-first search strategies in current statistical translation architectures. The approach employs the stochastic bracketing transduction grammar (SBTG) model we recently introduced to replace earlier word alignment channel models, while retaining a bigram la...
Fertility-based Source-Language-biased Inversion Transduction Grammar for Word Alignment
We propose a version of Inversion Transduction Grammar (ITG) model with IBM-style notation of fertility to improve word-alignment performance. In our approach, binary context-free grammar rules of the source language, accompanied by orientation preferences of the target language and fertilities of words, are leveraged to construct a syntax-based statistical translation model. Our model, inheren...
Automatic Spoken Language Translation Template Acquisition Based on Boosting Structure Extraction and Alignment
In this paper, we propose a new approach for acquiring translation templates automatically from unannotated bilingual spoken language corpora. Two basic algorithms are adopted: a grammar induction algorithm, and an alignment algorithm using Bracketing Transduction Grammar. The approach is unsupervised, statistical, data-driven, and employs no parsing procedure. The acquisition procedure consist...
Machine Translation with a Stochastic Grammatical Channel
We introduce a stochastic grammatical channel model for machine translation, that synthesizes several desirable characteristics of both statistical and grammatical machine translation. As with the pure statistical translation model described by Wu (1996) (in which a bracketing transduction grammar models the channel), alternative hypotheses compete probabilistically, exhaustive search of the tr...